A Unifying Framework of Bilinear LSTMs
This paper presents a novel unifying framework of bilinear LSTMs that can represent and exploit the nonlinear interactions among input features in sequence datasets, achieving superior performance over a linear LSTM without incurring more parameters to be learned. To realize this, our framework balances the expressivity of the linear vs. bilinear terms by trading off the hidden state vector size against the approximation quality of the weight matrix in the bilinear term, so as to optimize the performance of our bilinear LSTM while keeping the parameter count fixed. We empirically evaluate the performance of our bilinear LSTM on several language-based sequence learning tasks to demonstrate its general applicability.
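
As an illustration only (hypothetical names, not the authors' code), the sketch below shows one way a rank-r factorization can add a bilinear input-hidden interaction to a gate while keeping the parameter count close to that of a linear gate; the rank r plays the role of the approximation-quality knob traded off against the hidden state size above.

```python
import torch
import torch.nn as nn

class LowRankBilinearGate(nn.Module):
    """Toy gate pre-activation with a linear part plus a rank-r bilinear
    interaction: z_o = (W_x x + W_h h)_o + x^T (sum_r U[o,:,r] V[o,:,r]^T) h.
    A full bilinear tensor would need O(H^2 * I) parameters; the rank-r
    factors need only O(H * r * (I + H))."""

    def __init__(self, input_dim, hidden_dim, rank):
        super().__init__()
        self.linear_x = nn.Linear(input_dim, hidden_dim)
        self.linear_h = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Rank-r factors of each output unit's bilinear weight matrix.
        self.U = nn.Parameter(0.01 * torch.randn(hidden_dim, input_dim, rank))
        self.V = nn.Parameter(0.01 * torch.randn(hidden_dim, hidden_dim, rank))

    def forward(self, x, h):
        # x: (batch, input_dim), h: (batch, hidden_dim)
        xu = torch.einsum('bi,oir->bor', x, self.U)  # input-side projection
        hv = torch.einsum('bh,ohr->bor', h, self.V)  # hidden-side projection
        bilinear = (xu * hv).sum(-1)                 # contract over rank r
        return self.linear_x(x) + self.linear_h(h) + bilinear
```

Shrinking the hidden size while raising the rank (or vice versa) changes expressivity at roughly constant parameter count, which is the balance the framework optimizes.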
Federated Zeroth-Order Optimization using Trajectory-Informed Surrogate Gradients
Federated optimization, an emerging paradigm which finds wide real-world
applications such as federated learning, enables multiple clients (e.g., edge
devices) to collaboratively optimize a global function. The clients do not
share their local datasets and typically only share their local gradients.
However, the gradient information is not available in many applications of
federated optimization, which hence gives rise to the paradigm of federated
zeroth-order optimization (ZOO). Existing federated ZOO algorithms suffer from
the limitations of query and communication inefficiency, which can be
attributed to (a) their reliance on a substantial number of function queries
for gradient estimation and (b) the significant disparity between their
realized local updates and the intended global updates. To this end, we (a) introduce trajectory-informed gradient surrogates, which exploit the history of function queries during optimization for accurate and query-efficient gradient estimation, and (b) develop the technique of adaptive gradient correction using these surrogates to mitigate the aforementioned disparity. Based on these, we propose the federated zeroth-order optimization using trajectory-informed surrogate gradients (FZooS) algorithm for query- and communication-efficient federated ZOO. Our FZooS achieves theoretical improvements over existing approaches, which is supported by real-world experiments such as federated black-box adversarial attack and federated non-differentiable metric optimization.
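
For intuition, here is a minimal sketch of the surrogate-gradient idea under an assumed kernel-ridge-regression form (hypothetical names, not the paper's exact estimator): fit a surrogate to the entire history of function queries and differentiate it analytically, so gradient estimates cost no extra queries.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

class TrajectorySurrogate:
    """Hypothetical sketch of a trajectory-informed gradient surrogate:
    fit kernel ridge regression to the whole history of function queries,
    then differentiate the fitted surrogate analytically instead of
    spending fresh queries on finite-difference gradient estimates."""

    def __init__(self, lengthscale=1.0, reg=1e-6):
        self.ls, self.reg, self.X, self.alpha = lengthscale, reg, None, None

    def fit(self, X, y):
        K = rbf_kernel(X, X, self.ls) + self.reg * np.eye(len(X))
        self.X, self.alpha = X, np.linalg.solve(K, y)

    def grad(self, x):
        # d/dx k(x, xi) = -(x - xi) / ls^2 * k(x, xi) for the RBF kernel
        k = rbf_kernel(x[None, :], self.X, self.ls)[0]            # (n,)
        dk = -(x[None, :] - self.X) / self.ls ** 2 * k[:, None]   # (n, d)
        return dk.T @ self.alpha

# Toy usage: gradient estimate at a new point from past queries only.
X_hist = np.random.randn(20, 3); y_hist = np.sin(X_hist).sum(1)
s = TrajectorySurrogate(); s.fit(X_hist, y_hist)
print(s.grad(np.zeros(3)))
```

In a federated setting, each client would maintain such a surrogate from its own query trajectory; the paper's adaptive gradient correction, not sketched here, then reduces the gap between local and global updates.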
Nonmyopic ε-Bayes-Optimal Active Learning of Gaussian Processes
A fundamental issue in active learning of Gaussian processes is the exploration-exploitation trade-off. This paper presents a novel nonmyopic ε-Bayes-optimal active learning (ε-BAL) approach that jointly and naturally optimizes this trade-off. In contrast, existing works have primarily developed myopic/greedy algorithms or performed exploration and exploitation separately. To perform active learning in real time, we then propose an anytime algorithm based on ε-BAL with a performance guarantee and empirically demonstrate using synthetic and real-world datasets that, with a limited budget, it outperforms the state-of-the-art algorithms.
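
To illustrate what nonmyopic means here, the toy recursion below scores a query by its immediate reward plus the expected value of acting optimally over the remaining horizon (hypothetical names; posterior variance is used as a crude stand-in for the Bayes-optimal reward). Its cost grows exponentially with the horizon, which is exactly the blow-up the paper's ε-BAL approximation and anytime algorithm are designed to tame.

```python
import numpy as np

def gp_posterior(X, y, Xs, ls=1.0, noise=0.1):
    """Exact 1D GP posterior mean/variance with an RBF kernel (toy helper)."""
    def k(A, B):
        return np.exp(-0.5 * ((A[:, None] - B[None, :]) ** 2) / ls ** 2)
    K = k(X, X) + noise ** 2 * np.eye(len(X))
    Ks, Kss = k(X, Xs), k(Xs, Xs)
    L = np.linalg.solve(K, Ks)
    return L.T @ y, np.diag(Kss) - np.sum(Ks * L, axis=0)

def lookahead_value(X, y, cands, horizon, n_samples=3):
    """Value of querying optimally for `horizon` more steps: immediate
    reward plus the expected value of acting optimally afterwards, with
    the expectation over the unseen label approximated by a few posterior
    samples. Exponential in horizon; for illustration only."""
    if horizon == 0:
        return 0.0
    mu, var = gp_posterior(X, y, cands)
    best = -np.inf
    for i, x in enumerate(cands):
        future = 0.0
        for _ in range(n_samples):
            y_sim = np.random.normal(mu[i], np.sqrt(max(var[i], 1e-12)))
            future += lookahead_value(np.append(X, x), np.append(y, y_sim),
                                      cands, horizon - 1, n_samples)
        best = max(best, var[i] + future / n_samples)
    return best

# Toy usage: 1D inputs, horizon-2 lookahead over a small candidate grid.
X0, y0 = np.array([0.0, 1.0]), np.array([0.2, -0.1])
print(lookahead_value(X0, y0, np.linspace(-2, 2, 9), horizon=2))
```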
Top-k Ranking Bayesian Optimization
This paper presents a novel approach to top-k ranking Bayesian optimization (top-k ranking BO), which is a practical and significant generalization of preferential BO to handle top-k ranking and tie/indifference observations. We first design a surrogate model that is not only capable of catering to the above observations but is also supported by a classic random utility model. Another equally important contribution is the introduction of the first information-theoretic acquisition function in BO with preferential observations, called multinomial predictive entropy search (MPES), which is flexible in handling these observations and can be optimized over all inputs of a query jointly. MPES possesses superior performance compared with existing acquisition functions that select the inputs of a query one at a time greedily. We empirically evaluate the performance of MPES using several synthetic benchmark functions, the CIFAR-10 dataset, and the SUSHI preference dataset. (35th AAAI Conference on Artificial Intelligence (AAAI 2021); extended version with derivations, 13 pages.)
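
As background, a classic random utility model that accommodates ranking observations is the Plackett-Luce (multinomial logit) model; the sketch below (illustrative only, not the paper's surrogate, which also handles ties/indifference) computes the likelihood of a top-k ranking given latent utility values.

```python
import numpy as np

def top_k_log_likelihood(utilities, ranking):
    """Plackett-Luce likelihood of a top-k ranking of query inputs given
    latent utilities f(x): the ranked items are chosen one by one, each
    with probability proportional to exp(f) among items not yet chosen."""
    remaining = list(range(len(utilities)))
    log_lik = 0.0
    for chosen in ranking:                  # e.g. the observed top-2 of 5
        logits = utilities[remaining]
        log_lik += utilities[chosen] - np.log(np.sum(np.exp(logits)))
        remaining.remove(chosen)
    return log_lik

# Example: 5 query inputs, observed top-2 ranking (input 3 best, then 0).
f = np.array([0.8, -0.2, 0.1, 1.5, 0.3])
print(top_k_log_likelihood(f, ranking=[3, 0]))
```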
Hessian-Aware Bayesian Optimization for Decision Making Systems
Many approaches for optimizing decision-making systems rely on gradient-based methods requiring informative feedback from the environment. However, when such feedback is sparse or uninformative, these approaches may perform poorly. Derivative-free approaches such as Bayesian Optimization mitigate the dependency on the quality of gradient feedback but are known to scale poorly in the high-dimensional setting of complex decision-making systems. This problem is exacerbated if the system requires interactions between several actors cooperating to accomplish a shared goal. To address the dimensionality challenge, we propose a compact multi-layered architecture modeling the dynamics of actor interactions through the concept of role. Additionally, we introduce Hessian-aware Bayesian Optimization to efficiently optimize the multi-layered architecture parameterized by a large number of parameters. Experimental results demonstrate that our method (HA-GP-UCB) works effectively on several benchmarks under resource constraints and malformed feedback settings.
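
For context, a generic GP-UCB acquisition step is sketched below (plain GP-UCB with assumed helper names; how HA-GP-UCB injects Hessian information into this loop is specific to the paper and not reproduced here).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def gp_ucb_step(gp, candidates, beta=2.0):
    """Pick the candidate maximizing the upper confidence bound
    mean + sqrt(beta) * std (generic GP-UCB, not HA-GP-UCB itself)."""
    mu, std = gp.predict(candidates, return_std=True)
    return candidates[np.argmax(mu + np.sqrt(beta) * std)]

# Toy loop over a stand-in black-box reward on 4 role parameters.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5, 4))       # initial random designs
y = -np.sum(X ** 2, axis=1)               # stand-in reward to maximize
for _ in range(10):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    x_next = gp_ucb_step(gp, rng.uniform(-1, 1, size=(256, 4)))
    X = np.vstack([X, x_next])
    y = np.append(y, -np.sum(x_next ** 2))
```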
A Distributed Variational Inference Framework for Unifying Parallel Sparse Gaussian Process Regression Models
This paper presents a novel distributed variational inference framework that unifies many parallel sparse Gaussian process regression (SGPR) models for scalable hyperparameter learning with big data. To achieve this, our framework exploits a structure of correlated noise process model that represents the observation noises as a finite realization of a high-order Gaussian Markov random process. By varying the Markov order and covariance function of the noise process model, different variational SGPR models result. This consequently allows the correlation structure of the noise process model to be characterized, for which a particular variational SGPR model is optimal. We empirically evaluate the predictive performance and scalability of the distributed variational SGPR models unified by our framework on two real-world datasets.
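
To make the noise model concrete, the toy below builds the covariance of a first-order Gaussian Markov (AR(1)) noise process (hypothetical helper and constants); its precision matrix is tridiagonal, and a Markov order of p widens the band to 2p+1, which is how varying the order yields different variational SGPR models.

```python
import numpy as np

def ar1_noise_covariance(n, sigma=0.5, rho=0.6):
    """Observation noises as a finite realization of a first-order Gaussian
    Markov (AR(1)) process: cov(eps_i, eps_j) = sigma^2 * rho^|i-j|.
    The Markov property makes the precision (inverse covariance) banded,
    which is what keeps such correlated-noise models tractable."""
    idx = np.arange(n)
    return sigma ** 2 * rho ** np.abs(idx[:, None] - idx[None, :])

C = ar1_noise_covariance(6)
print(np.round(np.linalg.inv(C), 2))   # tridiagonal up to numerical noise
```

Setting rho = 0 recovers i.i.d. observation noise, i.e., the standard SGPR setting, as one special case of the family.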
Fair yet Asymptotically Equal Collaborative Learning
In collaborative learning with streaming data, nodes (e.g., organizations)
jointly and continuously learn a machine learning (ML) model by sharing the
latest model updates computed from their latest streaming data. For the more
resourceful nodes to be willing to share their model updates, they need to be
fairly incentivized. This paper explores an incentive design that guarantees
fairness so that nodes receive rewards commensurate to their contributions. Our
approach leverages an explore-then-exploit formulation to estimate the nodes'
contributions (i.e., exploration) for realizing our theoretically guaranteed
fair incentives (i.e., exploitation). However, we observe a "rich get richer" phenomenon arising from existing approaches to guaranteeing fairness, which discourages the participation of the less resourceful nodes. To remedy this, we additionally preserve asymptotic equality, i.e., less resourceful nodes eventually achieve performance equal to that of the more resourceful/"rich" nodes. We empirically demonstrate in two settings with real-world streaming data, federated online incremental learning and federated reinforcement learning, that our proposed approach outperforms existing baselines in fairness and learning performance while remaining competitive in preserving equality. (Accepted to the 40th International Conference on Machine Learning (ICML 2023); 37 pages.)
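
The toy sketch below uses an entirely hypothetical reward rule, not the paper's mechanism, to illustrate how contribution-proportional rewards can be blended toward a uniform split over rounds, so that fairness dominates early while equality holds asymptotically.

```python
import numpy as np

def reward_shares(contributions, t, decay=0.05):
    """Hypothetical 'fair yet asymptotically equal' reward rule: early on,
    a node's share tracks its estimated contribution (fairness); the
    contribution-based gap decays over rounds t, so all nodes' shares
    converge to equality (asymptotic equality)."""
    c = np.asarray(contributions, dtype=float)
    fair = c / c.sum()                        # contribution-proportional
    equal = np.full_like(fair, 1.0 / len(c))  # uniform split
    w = np.exp(-decay * t)                    # fairness weight decays
    return w * fair + (1.0 - w) * equal

print(reward_shares([5.0, 2.0, 1.0], t=0))    # fairness-dominated
print(reward_shares([5.0, 2.0, 1.0], t=100))  # near-equal
```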
Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee
The growing literature of Federated Learning (FL) has recently inspired
Federated Reinforcement Learning (FRL) to encourage multiple agents to
federatively build a better decision-making policy without sharing raw
trajectories. Despite its promising applications, existing works on FRL fail to I) provide theoretical analysis of its convergence and II) account for random system failures and adversarial attacks. Towards this end, we propose the first FRL framework whose convergence is guaranteed and which is tolerant to less than half of the participating agents suffering random system failures or acting as adversarial attackers. We prove that the sample efficiency of the proposed framework is guaranteed to improve with the number of agents and is able to account for such potential failures or attacks. All theoretical results are empirically verified on various RL benchmark tasks. (Published at NeurIPS 2021; extended version with proofs and additional experimental details and results.)
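
As an illustration of why a less-than-half threshold is natural, the sketch below aggregates agents' updates with a coordinate-wise median, a standard Byzantine-tolerant aggregator (not necessarily the paper's exact rule): the median is unaffected so long as honest agents form a majority.

```python
import numpy as np

def robust_aggregate(updates):
    """Coordinate-wise median of agents' policy-gradient updates. Each
    coordinate of the result lies within the honest agents' range as long
    as fewer than half of the agents are faulty or adversarial."""
    return np.median(np.stack(updates), axis=0)

# 5 agents, 2 of them adversarial: the median stays near the honest values.
honest = [np.array([1.0, -0.5]) + 0.1 * np.random.randn(2) for _ in range(3)]
attack = [np.array([100.0, 100.0]) for _ in range(2)]
print(robust_aggregate(honest + attack))
```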
Batch Bayesian Optimization for Replicable Experimental Design
Many real-world experimental design problems (a) evaluate multiple
experimental conditions in parallel and (b) replicate each condition multiple
times due to large and heteroscedastic observation noise. Given a fixed total
budget, this naturally induces a trade-off between evaluating more unique
conditions while replicating each of them fewer times vs. evaluating fewer
unique conditions and replicating each more times. Moreover, in these problems,
practitioners may be risk-averse and hence prefer an input with both good
average performance and small variability. To tackle both challenges, we
propose the Batch Thompson Sampling for Replicable Experimental Design
(BTS-RED) framework, which encompasses three algorithms. Our BTS-RED-Known and
BTS-RED-Unknown algorithms, for, respectively, known and unknown noise
variance, choose the number of replications adaptively rather than
deterministically such that an input with a larger noise variance is replicated
more times. As a result, despite the noise heteroscedasticity, both algorithms
enjoy a theoretical guarantee and are asymptotically no-regret. Our
Mean-Var-BTS-RED algorithm aims at risk-averse optimization and is also
asymptotically no-regret. We also show the effectiveness of our algorithms in
two practical real-world applications: precision agriculture and AutoML. (Accepted to NeurIPS 2023.)
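
A back-of-the-envelope sketch of the adaptive-replication idea (hypothetical constants, not the paper's exact rule): replicate each condition enough times that its averaged observation reaches a target variance, so noisier conditions are replicated more, exactly as described above.

```python
import numpy as np

def adaptive_replications(noise_var, target_var=0.25, n_max=20):
    """Averaging n replications of a condition with noise variance sigma^2
    yields an observation with variance sigma^2 / n, so n >= sigma^2 /
    target_var replications suffice to hit the target (capped at n_max)."""
    return int(np.clip(np.ceil(noise_var / target_var), 1, n_max))

for s2 in [0.1, 0.5, 2.0]:
    print(s2, '->', adaptive_replications(s2), 'replications')
```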